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Abstract.  We  propose  several  methods  based  on  combinations  of  deflation  techniques  and  polyno¬ 
mial  iteration  methods,  for  computing  small  invariant  subspaces  of  large  matrices,  associated  with 
the  eigenvalues  with  largest  (or  smallest)  real  parts.  We  consider  both  Chebyshev  polynomials  and 
least-squares  polynomials  for  the  acceleration  scheme  and  we  propose  a  deflation  technique  which 
is  a  variant  of  Wielandt’s  deflation  that  does  not  require  the  left  eigenvectors  of  the  matrix.  As  an 
application  we  compare  our  methods  on  an  example  issued  from  a  bifurcation  problem  and  show 
their  efficiency  when  the  number  of  eigenvalues  required  is  small. 
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1.  Introduction 


1.1.  Previous  work  on  solving  large  nonsymmetric  eigenvalue  problems 

Many  important  problems  in  engineering  require  the  computation  of  a  small  number  of  eigen* 
values  with  algebraically  largest  (or  smallest)  real  parts  of  a  large  nonsymmetric  real  matrix  A. 
Of  the  few  typical  examples  reported  in  [26]  we  only  mention  the  important  class  of  bifurcation 
problems  [12],  from  which  we  will  draw  our  main  test  example.  From  the  numerical  point  of  view, 
nonsymmetric  eigenvalue  problems  can  be  substancially  more  difficult  to  deal  with  than  the  sym¬ 
metric  ones.  Perhaps  this  is  one  reason  for  the  lack  of  significant  progress  on  procedures  for  treating 
nonsymmetric  matrix  eigenproblems. 

There  have  been  mainly  three  basic  methods  for  solving  large  nonsymmetric  eigenvalue  prob¬ 
lems  investigated  so  far.  The  first  is  Bauer’s  subspace  iteration  method  and  its  many  variations  [2, 
6,  11,  32,  33,  35].  An  important  drawback  of  this  method,  recognized  both  in  the  symmetric  case 
[17,  18],  and  the  nonsymmetric  case  [27,  26]  is  that  it  may  be  exceedingly  slow  to  converge.  Another 
weakness  of  the  subspace  iteration  method  is  that  it  computes  the  dominant  eigenvalues  of  A,  i.e., 
those  having  largest  modulii,  whereas  in  many  important  applications  it  is  the  eigenvalues  with 
largest  real  parts  that  are  wanted.  This  difficulty,  however,  can  be  obviated  by  using  Chebyshev 
acceleration  as  is  suggested  in  [26].  The  second  method  is  due  to  Arnoldi  [1,  27]  and  is  essentially 
am  orthogonad  projection  method  on  the  Krylov  subspace  {vi,  Avi,..  .Am-1vi}.  Thus,  Arnoldi’s 
method  is  a  generalization  of  the  symmetric  Lanczos  algorithm.  Its  main  drawback  is  that,  unlike 
the  symmetric  Lamczos  algorithm,  the  growth  of  computational  time  and  storage  becomes  excessive 
as  the  number  of  steps  increases.  Variations  on  the  basic  scheme  have  been  proposed  [27],  which 
lead  to  oblique  projection  type  techniques  [28],  but  their  theory  is  not  well  understood  and  we  will 
not  consider  them  here. 

The  third  method  is  the  nonsymmetric  Lanczos  method  [7,  14,  19,  20,  34]  which  is  another 
generalization  of  the  symmetric  Lanczos  algorithm  due  originally  to  Lanczos  himself.  It  produces 
a  tridiagonal  matrix  whose  eigenvalues  can  be  taken  as  approximations  to  the  eigenvalues  of  A.  At 
the  difference  with  Arnoldi’s  method,  this  is  not  an  orthogonal  projection  method,  but  an  oblique 
projection  method  [28].  Parlett,  Taylor  and  Liu  [20]  have  suggested  an  elegant  solution  to  the 
problem  of  breakdown  which  has  given  a  bad  reputation  to  the  Lanczos  method  in  the  past  [35]. 
Cullum  and  Willoughby  [7]  extend  their  symmetric  Lanczos  algorithm  without  reorthogonalization, 
to  the  nonsymmetric  case  and  suggest  a  new  method  for  dealing  with  the  resulting  non-hermitian 
tridiagonal  matrices.  On  the  whole  the  main  difficulty  with  the  Lanczos  method  is  theoretical,  as 
the  method  is  not  quite  well  understood  yet. 

To  these  three  basic  methods  we  should  add  the  important  shift  and  invert  technique  which  is 
not  an  algorithm  in  itself  but  simply  consists  of  transforming  the  original  problem  (A  -  A I)x  =  0 
into  (A  -  <rl)~lx  =  px  which  may  be  easier  to  solve  if  the  shift  is  close  to  some  eigenvalue  of  A. 
Notice  that  there  is  a  trade-off  when  using  shift  and  invert,  because  the  basic  matrix  by  vector 
multiplication  which  is  usually  inexpensive,  is  now  replaced  by  the  more  complex  solution  of  a 
linear  system  at  every  step:  the  factorization  of  the  matrix  (A  -  crl)  is  performed  only  once  and 
then  at  every  step  of  any  of  the  above  methods,  one  solves  two  triangular  systems.  The  cost  of 
the  factorization  can  be  quite  high  and  in  the  course  of  an  eigenvalue  calculation,  one  needs  to  use 
several  shifts,  i.e.  several  factorizations.  Shift  and  invert  has  now  become  a  fairly  standard  tool  in 
structural  analysis  because  there  one  needs  to  solve  the  symmetric  generalized  eigenvalue  problem 
Mx  =  A Kx  and  since  there  is  at  least  one  factorization  to  perform  anyway,  shift  and  invert  no 
longer  appears  expensive  [23,  17,  18,  31]. 

Comparing  the  limitations  of  these  methods,  we  should  emphasize  that  subspace  iteration  is 
only  able  to  compute  a  small  number  of  eigenvalues  and  associated  eigenvectors.  To  some  extent 
Arnoldi’s  method  presents  the  same  limitations  in  practice.  The  nonsymmetric  Lanczos  algorithm 
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(especially  without  reorthogoualization  or  with  some  form  of  partial  reorthogonalization)  is  the 
only  method  that  has  the  potential  of  computing  a  large  number  of  eigenvalues  and  eigenvectors 
of  a  nonsymmetric  matrix  A  [7,  19] .  On  the  other  hand  the  Lanczos  algorithm  requires  the  use 
of  both  the  matrix  A  and  of  its  transpose.  As  will  be  seen  next,  there  are  applications  where  the 
matrix  A  is  not  available  explicitly  but  the  action  of  multiplying  A  by  a  vector  is  easy  to  perform, 
by  use  of  a  finite  difference  formula.  In  those  cases  aF  is  not  available  and  often  cannot  even  be 
approximated  with  finite  differencing. 

1.2.  Motivation 

In  this  paper  we  are  concerned  only  with  the  problem  of  computing  a  very  small  number  of 
eigenvalues  and  their  associated  eigenvectors  or  rather  their  associated  invariant  subspace.  Our 
motivation  is  that  in  most  realistic  applications  the  demand  is  to  compute  a  very  small  number  of 
eigenvalues  of  A  of  (algebraically]  largest  real  parts  or  smallest  real  parts.  In  these  applications 
one  wishes  to  determine  whether  a  certain  system  governed  by  a  partial  differential  equation  of  the 
form 

(i-i) 

where  F  is  a  partial  differential  operator,  and  8  some  real  parameter,  is  stable  for  some  value  of 
the  parameter  9.  Such  a  system  is  said  to  be  stable  if  all  the  eigenvalues  of  the  Jacobian  of  F  with 
respect  to  u,  have  negative  real  parts.  Hence,  all  that  may  be  wanted  here  is  to  compute  one  or  two 
(i.e.  a  comple  pair  of)  eigenvalues.  In  most  bifurcation  problems,  one  is  interested  in  a  singular 
phenomenon  occuring  past  a  few  singular  points,  turning  points  or  bifurcation  points,  but  their 
number  seldom  exceeds  3  or  4.  In  other  words  the  number  of  eigenvalues  to  compute,  i.e.  those 
that  have  nonnegative  real  parts  is,  say,  at  most  4  real  eigenvalues  or  4  complex  conjugate  pairs. 

An  important  observation  is  that  the  Jacobian  matrix  is  often  not  needed  explicitly  when  an 
eigenvalue  algorithm  that  uses  the  matrix  A  only  through  the  matrix  vector  multiplications  y  =  Ax 
is  employed.  This  is  because  the  multiplication  of  the  Jacobian  J,  evaluated  at  the  coordinate  u, 
times  a  vector  x  can  be  performed  at  low  cost  with  the  help  of  the  difference  formula 
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(1.2) 


where  £  is  some  small  and  carefully  chosen  scalar. 

The  approximation  (1.2)  has  been  the  main  instrument  in  the  success  of  the  so-called  matrix- 
free  Ordinary  Differential  Equations  solvers  [3,  5,  9]:  the  Jacobian  is  never  computed  -xplicitly 
which  results  in  significant  savings  both  in  computing  time  and  storage.  A  similar  principle  has  also 
been  employed  by  Eriksson  and  Rizzi  [8]  to  compute  eigenvalues  of  various  semi-discrete  operators 
used  in  compressible  fluid  flow  calculations.  On  the  other  hand,  if  the  eigenvalue  procedure  requires 
the  use  of  AT,  it  is  not  clear  how  one  can  avoid  the  explicit  computation  of  the  Jacobian,  since 
the  transpose  of  the  Jacobian  is  not  easily  approximated  in  a  similar  way.  We  should  add  however, 
that  this  is  only  valid  when  the  function  F(u,8)  is  some  complicated  nonlinear  function  for  which 
derivatives  are  particularly  hard  to  calculate.  Such  examples  abound  in  scientific  applications. 


1.3.  Overview  of  results 

For  the  above  class  of  problems  a  combination  of  polynomial  iteration,  such  as  Chebyshev 
iteration,  with  some  deflation  techniques  are  quite  appropriate.  This  paper  introduces  mainly  two 
methods  based  on  this  combination: 


•  A  deflation  method  which  is  a  particular  case  of  Wielandt  deflation.  An  error  analysis  of  the 
deflation  technique  is  proposed.  Polynomial  iteration  is  used  to  provide  a  good  initial  vector 
for  the  Arnoldi  process.  The  deflation  technique  enables  us  to  compute  one  eigenvalue  or  a 
pair  of  complex  conjugate  eigenvalues  at  a  time. 
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•  A  polynomial  preconditioning  technique  consisting  of  iterating  with  the  matrix  p( A) ,  where 
p  is  a  polynomial,  chosen  so  that  its  eigenvalue  distribution  leads  to  much  faster  convergence 
them  would  be  the  case  with  the  original  matrix  A. 

Instead  of  attempting  to  compute  several  eigenvalues  of  A  at  once  as  was  suggested  in  [27,  26, 
25]  the  class  of  methods  propsed  in  section  2,  consists  of  computing  only  one  eigenvalue  at  a  time 
or  possibly  a  pair  of  complex  conjugate  eigenvalues  at  a  time.  Deflation  is  then  used  to  compute 
the  next  desired  eigenvalues  and  eigenvectors  until  satisfied.  Our  goal  is  to  improve  robustness, 
sometimes  perhaps  at  the  expense  of  efficiency.  The  possible  non-availability  of  the  transpose  of  A 
as  in  the  above  applications,  dictates  that  we  choose  the  deflation  technique  to  be  a  Wielandt-type 
deflation  which  does  not  require  left  eigenvectors.  We  will  show  a  particular  type  of  Wielandt 
deflation  which  is  naturally  suited  for  computing  partial  Schur  forms. 

The  preconditioning  method  proposed  in  Section  4,  rests  on  the  idea  that  all  the  difficulties  in 
Arnoldi  type  methods,  come  from  the  poor  separation  of  the  desired  eigenvalues.  The  real  problem 
is  that  often  the  desired  eigenvalues  are  clustered  while  the  non  wanted  ones  are  well  separated, 
which  results  in  the  method  being  unable  to  retrieve  any  element  of  the  cluster  and  leads  to 
very  poor  performance,  often  divergence.  The  usual  polynomial  acceleration  methods  consist  of 
starting  the  Arnoldi  iteration  with  a  good  initial  vector  which  is  computed  from  a  polynomial 
iteration  of  the  form  z *  =  p(A)zo,  where  p  is  an  appropriately  chosen  polynomial.  However,  in 
some  cases  the  eigenvalue  separation  can  be  so  poor  that  the  Arnoldi  process  seems  even  unable  to 
take  advantage  of  a  good  initial  vector  and  quickly  introduces  unwanted  components.  Our  idea  of 
the  polynonial  preconditioned  Arnoldi  method  is  to  use  the  polynomial  acceleration  differently,  by 
simply  employing  the  polynomial  iteration  as  an  inner  loop  for  the  Arnoldi  process.  In  other  words 
the  matrix  A  is  replaced  by  the  preconditioned  matrix  p(A),  whose  eigenvalue  separation  around 
the  desired  eigenvalue  is  much  better  than  that  of  A. 

In  the  numerical  experiments  section  we  consider  an  example  which  is  a  parameter  dependent 
problem  of  the  sort  described  in  section  1.2.  Problems  of  that  sort  are  numerous  in  structural 
engineering  [4],  in  aerodynamics  (the  panel  flutter  problem  [29]),  chemical  engineering  [10],  fluid 
mechanics  [13]  and  many  other  fields.  Our  goal  is  to  demonstrate  how  the  proposed  methods 
perform  on  matrices  arising  from  a  typical  bifurcation  problem. 

2.  A  Schur- Wielandt  deflation  technique 

In  the  nonsymmetric  case  most  deflation  techniques  require  the  knowledge  of  right  and  left 
eigenvectors.  However,  these  deflation  procedures  of  which  an  example  is  Hotelling’s  deflation, 
can  be  ill-conditioned  because  determining  eigenvectors  of  a  matrix  can  be  itself  an  ill-conditioned 
problem.  In  fact  in  the  defective  case  there  is  no  basis  of  the  invariant  subspace  consisting  of 
eigenvectors  and  therefore  any  numerical  method  that  attempts  to  determine  such  a  basis  will 
likely  be  ill-conditioned.  As  suggested  by  Stewart  [33]  it  is  preferable  to  work  with  Schur  vectors, 
i.e.  with  an  orthonormal  basis  of  the  invariant  subspace,  when  dealing  with  the  nonsymmetric 
eigenvalue  problem.  A  partial  Schur  factorization  is  of  the  form 

AQ  =  QR 

where  Q  is  an  iV  x  p  complex  unitary  matrix  and  R  is  upper  triangular  complex  matrix.  Note  that 
the  order  of  the  eigenvalues  Ai,  A2,  •  •  •  Ap  as  they  appear  in  the  upper  triangular  matrix  R  is  crucial. 
In  fact  for  a  given  order  the  factorization  is  unique  in  the  usual  sense  of  QR  factorizations,  i.e.  the 
columns  of  Q  are  uniquely  determined  up  to  a  sign  of  the  form  e,s .  Thus,  whenever  we  choose  an 
ordering  of  the  eigenvalues,  we  can  deal  with  the  Schur  vectors  without  confusion  in  the  same  way 
that  we  deal  with  the  eigenvectors  of  a  Hermitian  matrix. 


In  this  section  we  describe  a  deflation  technique  which  is  a  simple  variation  of  Wielandt’s 
deflation  and  show  that  it  is  very  suitable  for  computing  orthonormal  bases  of  invariant  subspaces 
and  the  corresponding  partial  Schur  forms.  We  start  our  discussion  with  a  one  vector  deflation 
and  then  we  will  generalize  the  technique  to  several  vectors.  In  the  following  we  denote  by  ||.||  the 
2- norm  in  CN  and  by  Xff  the  transpose  of  the  complex  conjugate  of  a  matrix  X.  Unless  otherwise 
stated  the  eigenvalues  are  ordered  in  decreasing  order  of  their  real  parts  (if  a  conjugate  pair  occur 
then  the  one  with  positive  imaginary  part  is  first).  All  eigenvectors  are  assumed  to  be  normalized 
by  their  Euclidean  norms. 

2.1.  Deflation  with  one  vector 

Suppose  that  we  have  computed  the  eigenvalue  Ax  of  largest  real  part  and  its  corresponding 
eigenvector  «i  by  some  simple  algorithm,  say  algorithm  A  ,  which  always  delivers  the  eigenvalue 
of  largest  real  part  of  the  input  matrix,  along  with  an  eigenvector.  For  example,  in  the  particular 
case  where  all  the  eigenvalues  of  A  are  real  and  positive  Algorithm  A  can  simply  be  the  power 
method.  In  this  section  we  consider  the  simple  case  where  Ax  is  real.  It  is  assumed  that  the  vector 
u\  is  normailized  so  that  ||vx||  =  1.  The  problem  is  to  compute  the  next  eigenvalue  A2  of  A. 
An  old  technique  for  achieving  this  is  what  is  commonly  called  a  deflation  procedure:  a  rank  one 
modification  of  the  original  matrix  is  performed  so  as  to  displace  the  eigenvalue  Ax,  while  keeping 
all  other  eigenvalues  unchanged.  The  rank  one  modification  is  chosen  so  that  the  eigenvalue  A2 
becomes  the  one  with  largest  real  part  of  the  modified  matrix  and  therefore,  Algorithm  A  can  now 
be  applied  to  the  new  matrix  to  retrieve  the  pair  A2,  U2. 

Unlike  many  other  deflation  techniques,  Wielandt’s  deflation  requires  only  the  knowledge  of 
the  right  eigenvector.  The  deflated  matrix  is  of  the  form 

Ax  =  A  -  <ruixH,  '  (2.1) 

where  x  is  an  arbitrary  vector  such  that  xa u\  —  1,  and  a  is  an  appropriate  shift.  It  can  be 

shown  that  the  eigenvalues  of  Ax  are  the  same  as  those  of  A  except  for  the  eigenvalue  Ax  which  is 

transformed  into  the  eigenvalue  Ax  -  <7,  see  [35]. 

The  particular  choice  x  —  ux  has  the  interesting  property  of  preserving  the  Schur  vectors  of 
A.  More  precisely  we  can  state  the  following  proposition. 

Proposition  2.1.  Let  ux  be  an  eigenvector  of  A  of  norm  l,  associated  with  the  real  eigenvalue  Ax 
and  let 

Ax  =  A  -  ffuitif .  (2.2) 

Then  the  eigenvalues  ofAi  are  A,  =  A \-<r  and  A'-  =  Ay,  j  =  2, 3 ... .  N.  Moreover,  the  Schur  vectors 
associated  with  A' ,  j  —  1,2, 3 ...  X  are  identical  with  those  of  A. 

Proof.  Let 

AU  =  UR  (2.3) 

be  the  Schur  factorization  of  A,  where  R  is  upper  triangular  and  U  is  orthonormal.  Then  we  have 
A\U  =  [A  -  ou\Ui\U  =  UR  -  cru\e =  U[R  -  cryexef  ] 

The  result  follows  immediately. 


2.2.  Deflation  with  several  vectors 

Let  ux,U2,...u;  b*.  a  set  of  Schur  vectors  associated  with  the  eigenvalues  Aj,A2,...Ay.  We 
denote  by  U }  the  matrix  of  column  vectors  m,  u2, . . .  uy.  Thus, 

U}  =  [ux,  u2 . «y] 
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is  an  orthonormal  matrix  whose  columns  form  a  basis  of  the  eigenspace  associated  with  the  eigen¬ 
values  Aj,  A2,. .  .Ay.  We  do  not  assume  here  that  these  eigenvalues  are  real,  so  the  matrix  Uj  may 
be  complex.  An  immediate  generalization  of  Proposition  2.1  is  the  following. 

Proposition  2.2.  Let  £y  be  the  p  x  p  diagonal  matrix  Ey  =  Diag{o\,<J2,  ■ . . oy}.  Then  the 
eigenvalues  of  the  matrix 

Ay  =  A  -  UjXjUf, 

are  A'  =  A,-  —  07  for  i  <  j  and  A[  =  A,  for  i>j.  Moreover,  its  associated  Scbur  vectors  are  identical 
with  those  of  A. 

Proof.  Let  (2.3)  be  the  Schur  factorization  of  A.  We  have 

AjU  =  [A  -  UjZjUfp  =  UR  —  UjXjEf, 
where  Ej  =  [ei,  e2, . . .  eyj.  Hence 

AjU  =  -  EjZEf] 

and  the  result  follows. 

I 

It  is  interesting  to  note  that  the  preservation  of  the  Schur  vectors  is  analoguous  to  the  preser¬ 
vation  of  the  eigenvectors  under  Hotelling’s  deflation,  in  the  symmetric  case,  see  [35].  The  above 
proposition  suggests  a  very  simple  incremental  deflation  procedure  consisting  of  building  the  matrix 
Uj  one  column  at  a  time.  Thus,  at  the  j-th  step,  once  the  eigenvector  yy+i  of  Ay  is  computed  by 
the  appropriate  algorithm  A  we  can  orthonormalize  it  against  all  previous  ti,  ’s  to  get  the  next  Schur 
vector  uy+i  which  will  be  appended  to  Uj  to  form  the  new  deflation  matrix  f/y+j.  Clearly,  uy+1  is 
a  Schur  vector  associated  with  the  eigenvalue  Ay+i  and  therefore  at  every  stage  of  the  process  we 
have  the  desired  decomposition 

AUj  =  U j  Rj ,  (2.4) 

where  Rj  is  some  j  x  j  upper  triangular  matrix.  The  corresponding  algorithm  will  be  described  in 
detail  shortly.  v 

With  the  above  implementation,  we  may  have  to  perform  most  of  the  computation  in  complex 
arithmetic.  Fortunately,  when  the  matrix  A  is  real,  this  can  be  avoided.  In  that  case  the  Schur  form 
is  traditionally  replaced  by  the  quasi-Schur  form,  in  which  one  still  seeks  for  the  factorization  (2.4) 
but  simply  requires  that  the  matrix  Rj,  be  quasi-triangular,  i.e.  one  allows  for  2  x  2  diagonal  blocks. 
In  practice,  if  Ay+l  is  complex,  most  algorithms  do  not  compute  the  complex  eigenvector  yy+1 
directly  but  rather  deliver  its  real  and  imaginary  parts  yn,  yi  separately.  Thus  the  two  eigenvectors 
yR  ±  iy/  associated  with  the  complex  pair  of  conjugate  eigenvalues  Ay+i,Ay+2  =  A;+1  are  obtained 
at  once. 

Thinking  in  terms  of  bases  of  the  invariant  subspace  instead  of  eigenvectors,  one  important 
observation  is  that  the  real  and  imaginary  parts  of  the  eigenvector,  generate  the  same  subspace 
as  the  two  conjugate  eigenvectors  and  therefore  there  is  no  point  in  working  with  the  (complex) 
eigenvectors  instead  of  these  two  real  vectors.  Hence  if  a  complex  pair  occurs,  all  we  have  to  do 
is  orthogonalize  the  two  vectors  yR,yj  against  all  previous  u,’s  and  pursue  the  algorithm  in  the 
same  way.  The  only  difference  is  that  the  size  of  Uj  increases  by  two  instead  of  just  one  in  these 
instances. 

We  can  now  sketch  the  Schur-Wielandt  deflation  procedure  for  computing  the  p  eigenvalues  of 
largest  real  parts. 
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Algorithm:  Progressive  Schur-  Wieiandt  Deflation  (PSWD) 

(1)  Initialize: 

j  :=  0,  U0  :=  {0},  E0  :=  0. 

(2)  Compute  next  eigenvector  (s)  : 

Call  algorithm  A  to  compute  the  eigenvalue  Ay+i  (resp.  the  conjugate  pair  of  eigenvalues 
Ay+i,  Ay+2  s  Xj+i)  of  largest  real  part  of  the  matrix  Ay  =  A  -  UjUjUj1,  along  with 
an  eigenvector  y  (resp.  the  real  part  and  imaginary  part  yR,yr  of  the  complex  pair  of 
eigenvectors).  Choose  the  next  shift  <ry+1,  and  define  Ey+i  :=  Diag  (oq,<72,  ■  ■  .<rJ+1}. 

(3)  Orthonormalize: 

Orthonormalize  the  vector  y  (resp.  the  vectors  yR,yi)  against  the  vectors  «j,  «2,  ...uy,  to 
get  uy+i,  (resp.  uy+i,«y+2). 

Set  Uj+i  :=  [Uj,  uy+i],  j  :=  j  +  1,  (resp.  UJ+2  :=  [Uj,  uy+i,  «y+2],  j  :=  j  +  2.) 

(4)  Teat: 

If  j<p  goto  2,  else  set  p  :=  j ,  compute  Rp  :=  Up  AUP  and  exit. 


A  few  additional  details  on  the  implementation  of  each  step  of  the  algorithm  are  now  given. 
First,  we  point  out  that  the  above  algorithm  has  as  a  parameter  the  algorithm  A  ,  which  delivers 
the  eigenvalue(s)  with  largest  real  part(s)  with  its  (their)  associated  eigenvector(s).  We  will  discuss 
various  choices  of  this  algorithm  in  the  next  section.  The  shift  <7y+i  in  step  2,  is  chosen  so  that  the 
eigenvalues  Alt  A2, . . .  Ap  will  in  turn  be  the  ones  with  largest  real  parts  during  the  algorithm.  There 
is  much  freedom  in  choosing  the  shift  but  it  is  clear  that  if  it  is  too  large  then  a  poor  performance 
in  step  2  of  the  algorithm  will  result.  Ideally,  we  might  consider  choosing  <r  so  the  real  part  of 
the  eigenvalue  just  computed,  i.e.  X,  coincides  with  that  of  the  last  eigenvalue  Ajy.  This  yields 
or;+i  =*  Re(Xj  -  Xn).  Clearly,  this  value  is  not  available  beforehand  but  it  suffices  to  have  a  rough 
estimate.  Practically,  we  found  it  convenient  and  not  restrictive  in  any  way  to  take  all  shifts  equal 
to  some  equal  value  <r  determined  at  the  very  first  step  j  =  1.  The  matrix  Ey  then  becomes  <r I. 

For  step  2,  we  will  give  more  detail  in  the  next  sections  on  how  to  compute  the  eigenvectors  y 
or  the  pair  of  conjugate  eigenvectors  yR  ±  iyt .  A  crucial  point  here  is  that  the  matrix  .4y  is  never 
formed  explicitly,  since  this  would  fill  the  matrix  and  is  highly  ineffective.  Clearly,  if  p  is  large 
the  computational  time  of  each  matrix  by  vector  multiplication  becomes  very  expensive.  Another 
potential  difficulty  which  we  consider  in  detail  later  is  the  building  up  of  rounding  errors. 

In  step  3,  several  possibilities  of  implementation  exist.  The  simplest  one  which  we  have  adopted 
in  our  codes  consists  in  a  modified  Gram  Schmidt  algorithm  which  allows  for  up  to  two  reorthogo- 
nalizations  depending  on  level  of  cancellation.  Another  more  expensive  method  of  orthogonalizing 
a  set  of  vectors  which  is  somewhat  more  robust  is  the  Householder  algorithm. 

Before  exiting  in  step  4,  the  upper  triangular  matrix  Rp  is  computed.  For  brievety  we  have 
omitted  to  say  in  the  algorithm  that  one  need  only  to  compute  the  upper  quasi-triangular  part 
since  it  is  known  in  theory  that  the  lower  part  is  zero.  Note,  that  the  presence  of  2  x  2  diagonal 
blocks  requires  a  particular  treatment.  Alternatively,  we  may  compute  all  the  elements  of  the  upper 
Hessenberg  part  of  Rp,  at  a  slightly  higher  cost.  However,  as  will  be  seen  in  Section  2.3,  this  is 
not  necessarily  the  best  choice.  In  the  presence  of  round-off,  the  matrix  Rp  =  Up  AUP  is  slightly 
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different  from  the  Schur  matrix,  and  computing  its  eigenvalues  correponds  to  applying  a  Galerkin 
process  onto  the  subspace  spanned  by  the  block  Up. 

2.3.  Error  Analysis 

In  this  section,  we  propose  a  few  a  posteriori  error  bounds  in  order  to  analyse  the  stability  of 
the  deflation  technique.  Typically,  at  each  step  j  =  1,2, ...p  of  the  deflation  process  we  compute 
an  approximate  eigenvalue  A  y  and  an  associated  normalized  eigenvector  yy  of  the  matrix  Ay_ 1  = 
A  —  As  a  convention  we  define  Ac  to  be  the  matrix  A.  The  approximate  eigenpair 

satisfies  the  relation 

Aj-iyj  =  \jyj  +  i />,  j  -  1, . . . p  (2.5) 

where  the  residual  vector  yy  is  some  vector  of  small  norm  and  is  assumed  to  include  both  the 
effects  of  approximation  and  rounding.  It  is  assumed  that  the  matrix  Up  is  orthonormal  to  working 
precision.  Our  purpose  is  to  provide  some  information  on  the  accuracy  of  the  Schur  basis  Up  and 
possibly  of  the  eigenvalues  obtained  from  the  approximate  eigenvalues  Ay,  j  =  1, . .  .p. 

At  step  number  j,  the  vector  yy  is  orthogonalized  against  tii,u2,...,uy_i  to  obtain  the  jth 
approximate  Schur  vector  tiy.  This  is  realized  by  a  Gram-Schmidt  process  and  as  a  result  the 
following  relationship  between  the  vectors  u,-  and  yy  holds: 

j 

J ^/3,,yu,- =  yy  j  —  1,2,  ...p. 

i=l 

Denoting  by  6y  the  vector  of  p  components  /?i  y,/?2,y, . . .  ,/?yy,0,0, . .  .0,  the  above  relation  can  be 
rewritten  as 

Upbj  =  yy. 

Replacing  this  relation  in  (2.5),  we  have 

(A  -  Uj-i Ey  _ j j ) Upbj  =  XjUpbj  +  yy 


or 

AUpbj  =  U}.^j.xUf_xUpbj  +  A  jUpbj  +  yy.  (2.6) 

Although  there  are  only  p  —  1  shifts  <r,  used  when  p  eigenvalues  are  computed,  it  is  convenient  to 
define  ap  =  0  and 

Sp  =  Diag  {a i , cr2 . crp-i,(Tp}. 

Then  (2.6)  becomes 


AUpbj  =  Up  [£p  +  (Ay  -  a,) I]  bj  +  yy,  j  =  1,2 . p. 

Let  Bp  be  the  p  x  p  upper  triangular  matrix  having  as  its  column  vectors  the  6(s,  Ep  the  N  x  p 
matrix  having  as  its  column  vectors  the  yjs  and  \p  =  Diag{\i,  A2, . . .  Ap}.  Then  the  above  relation 
translates  into  the  matrix  relation: 


AUPBP  =  Up  [ZpBp  +  Z?p(Ap  -  S„)]  +  Ep,  (2.7) 

which  we  rewrite  in  a  final  form  as 

AUP  =  Up  [Sp  +  Bp(\p  -  SP)Z?-']  +  EpB~l .  (2.8) 
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For  convenience,  we  define 


(2.9) 


Zp  =  EpB~l 

and 

Cp  =  — p  +  Ep(\p  —  Ep)£p  l.  (2-10) 

Observe  that  when  <r,-  =  A,  ,  i  =  1,. .  ,p  -  1  then  the  matrix  Up  diagonalizes  partially  the  matrix  A 

dEp  =  0. 

At  the  final  stage  of  Algorithm  PSWD,  there  are  two  ways  of  post  processing  before  exiting. 

•  Either  one  accepts  the  values  A,-, *  =  1,  ...p  as  approximate  eigenvalues  and  does  not  attempt 
to  improve  them.  The  representation  of  the  section  of  A  in  the  approximate  invariant  subspace 
Up  is  taken  to  be  the  matrix  Cp  defined  by  (2.10). 

•  Or  one  performs  a  final  Galerkin  projection  onto  the  subspace  spanned  by  Up  in  order  to 
improve  the  current  approximations.  This  is  done  by  replacing  the  approximate  eigenvalues 
A,-,  i  =  1, . .  .p  by  the  eigenvalues  of  the  matrix  Rp  =  UjJ AUP. 

We  will  mainly  focus  our  attention  on  the  second  approach,  which  is  more  attractive.  In 
this  case  the  Galerkin  process  involves  some  extra  work,  since  the  computation  of  the  matrix 
Rp  itself  costs  us  p2  inner  products.  However,  since  p  is  small  this  is  negligible  as  compared 
with  the  total  work  incurred  during  the  whole  computation.  Note  that  Rp  is  a  full  matrix  with 
small  lower  triangular  part,  and  one  might  still  want  the  partial  Schur  form  corresponding  to  the 
improved  eigenvalues.  This  is  easily  done  by  computing  the  Schur  factorization  of  the  matrix  Rp, 
Rp  =  QpSpQp  and  then  defining  the  new  Up  matrix  by  Upntv  —  UPQP. 

Consider  any  N  x  (IV  —  p)  matrix  W  =  [u/i,  w-i, . . .  Wff-P]  which  complements  the  matrix  Up 
into  an  orthonormal  N  x  N  matrix,  i.e.,  so  that  the  matrix  [ UP,W ]  is  orthonormal.  The  matrix 
representation  of  the  matrix  A  in  this  new  basis  is  such  that 

AP„W]  =  lU„W]^JrZr  £"), 


in  which  „Yi2  =  Up  AW  ,Xw  =  WHAW,  and  Zp,  Rp  have  been  defined  above. 

The  above  equation  indicates  that  [Up,  W]  almost  realizes  a  Schur  factorization  of  A  when  Zp 
is  small.  In  fact,  the  factorization  can  be  rewritten  in  the  following  form: 


A-[UP,W]  (w°2'  [Up,w\H  =  [Up,w\  [Up,w\H.  (2.11) 

When  a  Galerkin  correction  step  is  taken,  then  the  approximate  Schur  factorization  corresponds 
to  taking  Up  as  the  basis  of  the  eigenspace  and  Rp  as  the  representation  of  A  in  that  subspace.  As 
a  consequence,  in  the  approach  using  a  correction  step,  equation  (2.11)  establishes  that  the  final 
result  is  equivalent  to  perturbing  the  initial  matrix  A  by  a  matrix  which  is  unitarily  similar  to  the 
matrix 

(  ° 

\WHZp  oj- 


Thus,  the  eigenvalues  of  R„  will  be  good  approximations  of  those  of  .4  if  they  are  well  conditioned, 
whenever  the  norm  of  W"ZP  is  small.  The  first  case  (no  correction)  can  be  treated  in  the  same 
way  and  one  can  easily  prove  that  the  perturbation  matrix  is  unitarily  similar  to 


u»Zp  o\ 
wHzp  o)' 
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This  analysis  proves  that  the  key  factor  for  the  stability  of  the  deflation  method  is  the  way  in  which 
the  norm  of  Zp  increases. 

We  now  wish  to  provide  a  result  which  establishes  an  a-posteriori  upper  bound  of  the  Frobenius 
norm  of  Zy  as  j  increases.  The  column  vectors  Zj,j  —  1,2, . .  .p  of  Zp  satisfy  the  relation: 

j 

1=1 

from  which  we  derive  the  upper  bound 

^INI<lkll  +  EAylNI- 

1=1 

Using  the  Cauchy-Schwartz  inequality  for  the  last  term  on  the  right-hand  side  we  get 

[7-1  W  p-i 

0:j\\zj\\<\\Vj\\+  E4  -  ElWI2 

.1=1  J  L,=i 

Since  we  have  assumed  that  the  eigenvector  y,,  which  is  orthogonalized  against  the  previous  u's, 
is  of  norm  unity,  an  important  observation  is  that  the  sum  of  the  squares  of  the  /?,y  is  one  and 
/Jyy  represents  simply  the  sine  of  the  angle  0y  between  yy  and  the  subspace  spanned  by  the  vectors 
u,,  i=l,..  .j  —  1.  Therefore,  denoting  by  p,  the  Frobenius  norm  of  the  matrices  Z,,  i  =  1, . .  .  p  the 
above  inequality  reads 

«n(0y)||*y|l  <  ib;||  +  COS (^)p;_,. 

Adding  the  term  sin{0j)pj_x  to  both  sides  and  using  the  inequality  (a2  +  62)1/2  <  a  +  b  for  the 
resulting  left  hand-side  we  obtain 

sin(ej)Pj  <  !k/ll  +  (sinOj  +  cos9,)p3-.x, 

which  is  restated  in  the  following  proposition. 

Proposition  2.3.  The  Frobenius  norms  p3  of  the  matrices  Zj,j  =  1, ...p  satisfy  the  recurrence 
relation 

py  <(1  + COt  +  -M,  (2.12) 

where  93  is  the  a  cute  angle  between  the  eigenvector  yy  obtained  at  the  j,h  deflation  step  and  the 
previous  approximate  invariant  subspace  span{Uj~i}  and  where  is  its  residual  vector. 

It  is  important  to  note  that  since  by  definition  sin  9j  =  &ij  all  the  quantities  involved  in  the 
proposition  are  available  during  the  computation  and  so  the  above  recurrence  is  easily  computable 
starting  with  the  initial  value  po  =  0.  The  result  can  be  interpreted  as  follows:  if  the  angle  between 
the  computed  eigenvector  and  the  previous  invariant  subspace  is  small  at  every  step  then  the  process 
may  quickly  become  unstable.  On  the  other  hand  if  this  is  not  the  case  then  the  process  is  quite 
safe,  for  small  p.  The  interesting  point  is  that  the  above  recurrence  can  practically  be  used  to 
determine  whether  or  not  there  is  such  a  risk  of  instability.  The  cause  of  the  potential  instability 
is  even  narrowed  down  to  the  orthogonalization  process.  If  each  newly  computed  vector  y}  were 
orthogonal  to  the  previous  ones  then  clearly  Bp  would  be  the  identity  matrix  and  there  would  be 


no  risk  of  amplification  of  errors.  This  opens  up  an  interesting  possibility.  Assume  that  instead 
of  computing  an  approximate  eigenpair  Ay,  yy  satisfying  the  relation  (2.5)  one  is  able  by  some 
hypothetical  procedure  to  compute  a  Schur  pair  directly,  i.e.,  a  pair  Ay,uy  satifying  the  analogous 
relation 

i- 1 

Ay_  i  tty  =  Ay  tty  +  ~lij  tt,-  +  t/y .  (2.13) 

i=l 

Then  an  analysis  similar  to  the  one  used  to  establish  (2.8)  would  easily  lead  to  the  relation  AUP  = 
UpRp  +  Ep  where  Rp  is  the  upper  triangular  matrix  having  the  diagonal  elements  A,,i  =  l,p  and  the 
off  diagonal  elements  7,-y,  while  Ep  is  defined  as  before.  Thus,  in  this  case  Zp  is  simply  replaced  by 
Ep  and  the  process  is  always  stable.  In  a  way,  however,  the  difficulty  is  rejected  to  the  hypothetical 
procedure  that  would  compute  the  Schur  pair.  As  an  example,  a  naive  algorithm  for  computing  a 
Schur  pair  would  be  to  compute  the  eigenpair  and  then  orthogonalize  the  eigenvector  yj  against 
the  previous  u(s  to  get  uy.  By  doing  so  a  relation  of  the  form  (2.13)  is  always  satisfied  and  r/y  and 
its  norm  can  be  explicitly  computed.  If  ||ijy||  is  not  sufficiently  small  one  goes  back  to  compute  the 
eigenpair  Ay ,  yy  to  higher  accuracy  until  ||ijy||  is  as  small  as  wanted.  The  issue  of  whether  there 
exists  other  methods  that  delivers  directly  a  Schur  vectors,  is  worth  investigating. 

3.  Deflation  techniques  for  three  basic  methods 

In  this  section  we  review  a  few  methods  for  computing  eigenvalues  and  eigenvectors  of  large 
nonsymmetric  matrices  which  can  be  used  in  the  inner  loops  of  algorithm  PSWD  of  section  2.2.  The 
methods  are  only  briefly  summarized  as  they  have  been  fully  described  elsewhere  in  the  litterature 
[1,27,26,25]. 

3.1.  Arnoldi’s  method  with  deflation 

Arnoldi’s  method  may  not  be  considered  as  a  powerful  technique  in  itself  but  is  a  very  useful 
tool  when  combined  with  other  processes,  such  as  the  ones  to  be  described  in  the  next  sections. 
Starting  with  some  initial  vector  uj  of  Euclidean  norm  1,  the  method  generates  the  finite  sequence 
of  vectors  by  the  recursion: 


> 

hj+ijvj+i  =  Avj  -  Y2  h'>vi'  j =  l' ■  •  -m’  (3-1) 

i=i 

where  h, =  (Ai/y, i/<), i  =  l,...j  and  /iy+lj  is  the  2-norm  of  the  right  hand-side  of  (3.1).  The 
scalars  h,y  are  computed  so  that  the  sequence  iq,  v?  . . .  vm  is  an  orthonormal  sequence,  in  effect  an 
orthonormal  basis  of  the  Krylov  subspace  Km  =  ,span{iq ,  Atq , . . .  .4m_I  tq }. 

Defining  Vm  as  the  .V  x  m  matrix  whose  ith  column  is  the  vector  v,-,  for  t  =  1, 2, . . .  m  and  Hm 
as  the  upper  Hessenberg  matrix  whose  entries  are  the  coefficients  h,}  computed  during  Arnoldi's 
method,  a  simple  consequence  of  the  relation  (3.1)  is  that 

V^AVm  =  Hm.  (3.2) 


Therefore  the  eigenvalues  of  Hm  constitute  the  Galerkin  approximations  of  the  eigenvalues  of  A  on 
the  Krylov  subspace.  Moreover,  the  corresponding  approximate  eigenvectors  are  given  by 


(m) 

r, 


(3.3) 


where  c(m)  is  an  eigenvector  of  the  Hessenberg  matrix  Hm  associated  with  the  eigenvalue  A|m*. 
This  was  the  basis  of  Arnoldi's  original  method  presented  in  [ij.  For  some  details  on  theory  and 
practical  use  of  this  process  see  [27], 
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6.  Summary  and  Conlcusion 

We  have  presented  two  classes  of  methods  for  computing  a  few  eigenvalues  and  the  correspond¬ 
ing  eigenvectors  or  Schur  vectors  of  large  nonsymmetric  matrices.  The  first  comprises  a  deflation 
method  combined  with  any  type  of  polynomial  iteration.  The  second  can  be  viewed  as  a  precon¬ 
ditioned  Arnoldi  method,  whereby  one  uses  Arnoldi’s  method  to  iterate  with  a  polynomial  in  A 
instead  of  A  itself.  These  methods  are  of  interest  only  when  the  number  of  eigenvalues  to  be  com¬ 
puted  is  relatively  small,  such  as  when  dealing  with  the  stability  analysis  in  nonlinear  diferential 
equations,  or  in  the  analysis  of  various  bifurcation  phenomena.  In  those  problems,  a  (few)  right¬ 
most  eigenvalues  of  some  Jacobian  matrix  must  be  computed  in  continuation  type  techniques.  It  is 
clear  that  the  information  gathered  from  previous  continuation  steps  can  be  used  if  the  marching 
parameter  varies  slowly:  in  this  fashion  a  good  initial  vector  for  the  next  run  is  available  as  well 
as  a  good  convex  hull  of  the  unwanted  eigenvalues  and  one  can  expect  a  relatively  moderate  extra 
work  at  each  new  continuation  step. 

The  deflation  technique  can  also  be  of  great  help  when  dealing  with  the  generalized  eigenvalue 
problem.  There,  if  one  uses  an  Arnoldi  (or  nonsymmetric  Lanczos)  method,  big  savings  can  be 
made  by  using  deflation  because  it  allows  to  go  farther  in  the  spectrum  without  having  to  perform 
a  new  factorization  of  K  —  <tM  too  soon.  In  essence  the  selective  orthogonalization  technique 
developed  by  Parlett  and  Scott  [17,  30]  realizes  a  similar  deflation  technique  in  the  symmetric  case 
in  a  more  economical  way.  Our  analysis  of  section  2.3  and  our  experiments  indicates  that  the 
Schur-Wielandt  deflation  is  safe  to  use.  The  a-posteriori  upper  bound  of  of  Proposition  2.3,  can  be 
used  in  practice  to  determine  how  accurate  a  computed  basis  of  an  invariant  subspace  is. 
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Table  1:  Comparison  of  the  estimated  Frobenius  norms  of 
the  errors  in  the  invariant  subspaces  with  the  actual  norms. 

stopping  criterion  as  before  it  took  922  matrix  by  vector  multiplications  for  the  method  ARNLS 
to  converge  with  m  =  30  and  k  =  15.  As  a  comparison  it  took  1204  such  multiplications  for  the 
method  LSARN  used  with  deflation  to  deliver  all  the  three  pairs  of  eigenvalues.  (400  for  the  first 
pair  532  for  the  second  pair  and  272  for  the  last  pair).  The  Chebyshev/Arnoldi  method  performed 
similarly,  taking  a  total  of  1264  matrix-vector  multiplications  for  delivering  the  three  pairs.  In  the 
graph  of  Figure  3,  we  have  plotted  the  convergence  history  of  the  two  methods  ARNLS  and  LSARN. 
In  LSARN  the  degree  of  polynomials  is  100,  i.e.,  we  compound  5  times  a  polynomial  of  degree  20. 
Three  curves  corresponding  to  the  convergence  history  of  each  of  the  three  pairs  of  eigenvalues  are 
drawn.  As  before  we  have  plotted  the  relative  error  (5.5)  of  the  computed  eigenvalue,  versus  the 
accumulated  number  of  matrix  by  vector  multiplications  during  the  run. 

For  the  method  ARNLS  the  eigenvalues  are  computed  simultaneously  and  we  have  therefore 
graphed  the  average  of  the  relative  errors,  over  the  3  pairs.  Here  we  have  taken  m  =  30  and  the 
degree  k  is  set  to  15.  Since  there  is  little  difference  between  the  Chebyshev  method  CHBARN  and 
the  least-squares  method  LSARN,  we  have  omitted  to  plot  the  results  obtained  with  CHBARN. 

5.4.  The  Frobenius  norm  error  bound 

In  this  test  we  verify  the  error  bound  (2.12)  of  section  2.3  and  in  particular  show  how  close 
the  estimate  can  be.  The  test  matrix  is  the  same  as  in  the  preceding  tests  but  of  size  N=100, 
which  corresponds  to  a  discretization  of  n  =  50  interior  mesh  points.  We  have  computed  the  10 
rightmost  eigenvalues  and  their  associated  Schur  vectors  by  using  only  one  method  as  Algorithm 
A  ,  namely  LSARN  with  m  =  10,  and*  polynomial  of  degree  100  =  5  x  20.  Here,  the  stopping 
criterion  for  each  eigenpair  is  that  the  actual  residual  norm  be  less  than  t  =  10-5.  In  other  words 
the  norms  of  the  vectors  »/,  as  defined  by  (2-5)  are  less  than  e  except  for  rounding  in  the  actual 
computation  of  this  residual  which  is  negligible  in  view  of  the  fact  that  e  is  large  compared  to  the 
unit  round-off.  As  soon  as  a  new  pair  of  complex  conjugate  eigenvalues  converged,  we  computed 
the  corresponding  new  Frobenius  norm  of  Z;  and  the  corresponding  estimate  given  by  (2.12).  The 
results  are  shown  in  Table  1.  The  10  rightmost  eigenvalues  are  all  complex  and  so  they  appear  in 
pairs.  In  this  example,  in  fact  in  all  our  tests  conducted  with  this  class  of  test  matrices,  there  is  a 
good  agreement  between  the  estimated  norm  and  its  actual  value. 


RESIOUflL  NORMS 
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8  LSfiRN  HORN  =  30.  ICTCLE=5X20) .  FIRST  PAIR 
C  SECOND  PAIR 
0  THIRD  PAIR 


Figure  3:  Computing  three  pairs  of  eigenvalues  with  two 
methods  for  N  =  200. 
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R  RRNIT  I  ITERATIVE  RRN0LDI).  IRRN  =  30 
8  CHBARN  (CHEBYSHEV-RRN8LDI ) .  IRRN  =  30.  I CYCLE  a  100 
C  LSARN  (LERST  SQURRES/RRN0LOI ) .  IRRN  =  30.  I CYCLE  a  5X20 

0  RRNLS  (LS/PREC0NOITI0NED  RRN0LDI).  IRRN  =  30.  NOEG  a  20 


Figure  2:  Computing  one  pair  of  eigenvalues  with  4  different 
methods  for  N  =  200  . 


21 


HRTRIX-VECT0R  MULTIPLICATIONS 


H  ARNIT  (ITERATIVE  ARN0LDI).  I ARM  =  20 
B  CH8ARN  (CHEBTSHEV-ARNBLDI).  I ARM  *  20.  I CYCLE  =  100 
C  LSARN  (LEAST  SOUARES/ARN0LOI) .  IARN  =  20.  I CYCLE  =  5X20 

0  ARNLS  (LS/PREC0HOITI0NEO  ARNOLD I) .  IARN  =  20.  NOEG  =  20  x 


Figure  1:  Computing  one  pair  with  4  different  methods  for 
N  =  200. 
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Arnoldi  step.  All  methods  are  started  with  the  same  initial  vector  which  is  a  random  vector.  The 
results  are  plotted  in  figure  1. 

ARNIT  did  not  show  any  sign  of  convergence  after  a  total  of  1000  matrix  by  vector  multiplica¬ 
tions.  LSARN  and  CHBARN  perform  similarly  while  ARNLS  differs  significantly  in  that  it  starts 
more  slowly  than  both  LSARN  and  CHBARN  but  as  soom  as  a  good  convex  hull  of  the  eigenval¬ 
ues  is  found,  it  outpaces  all  the  other  methods.  Note  that  for  simplicity  we  have  not  applied  any 
complicated  heuristic  such  as  varying  the  Arnoldi  dimension  m  from  lower  to  high  values  in  order 
to  build  the  convex  hull  H  more  gradually.  The  ICYCLE  paramater  in  the  figure  shows  the  degree 
of  polynomials  used  in  both  CHBARN  and  LSARN.  In  this  test  this  was  set  to  100.  However, 
for  LSARN,  the  polynomial  of  degree  100  is  obtained  by  compounding  (5  times)  a  least  squares 
polynomial  of  degree  20,  as  is  indicated  in  the  figure. 

It  is  difficult  to  select  a  suitable  stopping  criterion  for  nonsymmetric  eigenvalue  problems.  In 
our  case  we  have  adopted  to  stop  as  soon  as  the  residual  norm  is  smaller  that  some  tolerance  e. 
However,  the  matrices  may  be  seeded  differently  and  we  decided  to  scale  the  residual  norms  by 
the  average  singular  value  of  the  Hessenberg  matrix  produced  by  the  projection  process.  More 
precisely,  at  every  step  we  compute  the  square  of  the  Fobenius  norm  fm  =  Trace  (H%Hm),  and 
take  as  an  estimate  of  the  error  of  the  computed  pair  eigenvalue/eigenvector  the  number 


(5.6) 


where  p  is  the  computed  residual  norm  provided  by  the  method.  Note  that  the  denominator 
represents  the  square  root  of  the  average  of  the  squares  of  the  singular  values  of  Hm.  In  ARNLS 
the  same  scaling  is  used  except  that  for  the  projection  step  (Step  4  of  Algorithm  ARNLS),  Hm 
is  replaced  by  the  matrix  Am.  Recall  (27)  that  it  is  not  necessary  to  compute  the  eigenvectors 
explicitly  in  Arnoldi  in  order  to  get  the  residual  norms  because  these  are  equal  to  the  products  of 
hm+i.m  by  the  last  component  of  the  corresponding  normalized  eigenvectors  of  the  matrix  Hm. 

With  this  stopping  criterion,  and  €  =  10"7  the  method  stopped  in  the  order  indicated  in 
Figure  1,  i.e.  LSARN  stopped  first  with  a  total  of  480  matrix  multiplications,  then  CHABRN 
(total  of  620  matrix  vector  multiplications)  closely  followed  by  ARNLS  (total  of  654  matrix  vector 
multiplications)  and  then  ARNIT  (no  convergence).  Thus,  it  is  clear  from  this  example  that  the 
residual  norms  do  not  reflect  the  actual  errors  in  the  eigenvalues.  The  plot  shows  for  exaihple 
that  the  error  estimate  (5.6)  is  an  overestimate  of  the  actual  error  in  all  cases  (since  the  method 
continued  to  run  well  after  this  estimate  went  below  the  10"7  mark  in  the  plot).  The  intriguing  fact 
is  that  it  more  pessimistic  for  the  matrix  B*  than  it  is  for  the  matrix  A.  More  precisely,  at  the  end 
of  the  run  of  ARNLS  the  estimate  (5.6)  was  of  the  order  of  1.91  x  10“8  while,  as  is  indicated  by  the 
plot,  the  error  on  the  eigenvalue  is  9.63  x  10"15.  As  a  comparison,  CHBARN  and  LSARN  showed 
a  smaller  discrepancy:  the  estimate  is  9.1710-8  versus  the  actual  value  5.98  x  10“ 11  for  CHBARN 
and  9.29  x  10“8  versus  5.13  x  10“ 10  for  LSARN.  This  phenomenon  shows  that  the  preconditioning 
technique  does  actually  improve  the  conditioning  of  the  eigenpair:  the  actual  error  is  almost  of  the 
order  of  the  square  of  the  residual  norm  based  error  estimate,  just  like  for  symmetric  matrices. 

As  is  shown  in  the  next  experiment  the  performances  of  the  above  methods  depend  critically 
on  the  values  of  the  parameters  m  (dimension  of  Krylov  subspace  in  Arnoldi’s  method)  and  k  the 
degree  of  the  accelerating  polynomial.  In  the  next  plot  we  shows  the  same  experiment  as  in  the 
previous  one  except  that  the  Arnoldi  dimension  m,  is  set  to  30,  instead  of  20. 

5.3.  Computing  several  eigenvalues 

In  this  test  we  compute  the  6  rightmost  eigenvalues  of  the  same  200  x  200  matrix  .4  as  in  the 
previous  test.  These  6  rightmost  eigenvalues  form  three  complex  conjugate  pairs.  Using  the  same 
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and 


1  Dy  dgh{x,y) 

h2  Li  Tndtag{l%  2,1}+  dy 

respectively,  while  the  blocks  (1,2)  and  (2,1)  are 

dfh(x,y)  j  d  gh(x,y) 
di  “d  — — 

respectively.  Note  that  since  the  two  functions  /  and  g  do  not  depend  on  the  variable  z,  the 
Jacobians  of  either  /*  or  with  respect  to  either  x  or  y  are  scaled  identity  matrices.  We  denote 
by  A  the  resulting  In  x  2 n  Jacobian  matrix.  We  point  out  that  the  exact  eigenvalues  of  A  are 
readily  computable,  since  there  exists  a  quadratic  relation  between  the  eigenvalues  of  the  matrix 
A  and  those  of  the  classical  difference  matrix  Tridiag{l,  -2, 1}. 

For  reference  we  name  ARNIT  the  iterative  Arnoldi  method  of  section  3.1,  CHBARN  the 
Arnoldi-Chebyshev  method  with  deflation  of  Section  3.2,  LSARN  the  least  squares  polynomial 
method  combined  with  Arnoldi  of  Section  3.3  and  ARNLS  the  least  squares  preconditioned  Arnoldi 
method  of  section  4. 

5.2.  Computing  one  pair  of  eigenvalues 

In  this  first  test  we  compare  the  four  methods  described  earlier  to  compute  the  pair  of  eigen¬ 
values  having  largest  real  parts  of  the  2n  x  2n  matrix  A.  We  used  a  discretization  cf  n  =  100 
subintervals,  i.e.,  the  size  the  resulting  matrix  is  200.  We  then  ran  the  four  methods  with  a  size  m 
of  Arnoldi  dimension  equal  to  20,  in  all  cases,  and  in  either  CHBARN  or  LSARN  the  maximum 
degree  polynomial  was  100.  For  ARNLS  the  degree  of  polynomials  was  chosen  to  be  20.  However, 
note  that  the  program  has  the  capability  to  lower  the  degree  by  as  much  as  is  required  to  ensure 
a  well  conditioned  Gram  matrix  in  the  least  squares  polynomial  problem.  This  did  not  happen  in 
this  run  however,  i.e.  the  degree  was  always  20.  We  have  set  the  parameter  indicating  the  number 
of  wanted  eigenvalues  to  NEV  —  1.  Note  that  here  the  eigenvalue  of  largest  real  part  is  complex, 
in  fact  almost  exactly  purely  imaginary,  so  a  reasonable  code  should  deliver  a  pair  of  complex 
conjugate  eigenvalues  in  this  situation. 

The  residual  norms  provided  by  the  first  three  methods  which  deal  with  A  are  not  comparable 
with  those  provided  by  the  method  ARNLS  which  deals  with  B*,  a  polynomial  in  A.  We  therefore 
opted  to  compare  the  computed  eigenvalues  during  the  various'runs  with  the  exact  ones  which 
are  known.  The  exact  eigenvalues,  determined  with  maximum  accuracy  (double  precision),  i.e. 
approximately  16  digits  are 

A, ,2  =  1.8199876787305946  x  lO-5  ±  i  2.139497522076329 

As  is  observed  the  real  part  is  close  to  zero,  which  verifies  the  theory,  within  the  discretization 
errors.  We  have  plotted  the  relative  errors 


for  each  of  the  4  methods. 

For  ARNLS,  the  preconditioned  Arnoldi  method,  the  approximate  eigenvalue  A[m'  was  de¬ 
termined  as  the  Rayleigh  quotient  (.4<£,0)/(<A,<£)  obtained  from  the  approximate  eigenvector  <p  of 
the  matrix  B*,  which  is  easily  computed  within  the  Arnoldi  process  which  is  applied  to  B*.  This 
is  done  every  five  Arnoldi  loops.  For  the  other  methods,  the  error  is  plotted  after  each  adaptive 


two  reacting  and  diffusing  components,  where  0  <  z  <  1  represents  a  coordinate  along  the  tube, 
and  r  is  the  time,  are  modeled  by  the  system:  [21]: 


with  the  initial  condition 


dx  _  Dt  d2x 

Tt  ~ 


+  f{x,y). 


dy_ 

dr 


Dv  /  x 

1 2  qz  2  +y(x^y)' 


(5.1) 

(5.2) 


x(0,z)  =  x0(z),  y(0,z)  =  yo(z),  Vr€  [0,1], 


and  the  Dirichlet  boundary  conditions: 


x(0,r)  =  *(l,r)  =  S 
y(0»0  =  y(l,r)  =  g. 

The  linear  stability  of  the  above  system  is  traditionally  studied  around  the  steady  state  solution 
obtained  by  setting  the  partial  derivatives  of  x  and  y  with  respect  to  time  to  be  zero.  More  precisely, 
the  stability  of  the  system  is  the  same  as  that  of  the  Jacobian  of  (5.1)  -  (5.2)  evaluated  at  the 
steady  state  solution.  In  many  problems  one  is  primiraly  interested  in  the  existence  of  limit  cycles, 
or  equivalently  the  existence  of  periodic  solutions  to  (5.1),  (5.2).  This  translates  into  the  problem 
of  determining  whether  the  Jacobian  of  (5.1),  (5.2)  evaluated  at  the  steady  state  solution  admits  a 
pair  of  purely  imaginary  eigenvalues. 

We  consider  in  particular  the  so-called  Brusselator  wave  model  [21]  in  which 

/(x, y)  *  A-  (J?  +  l)x  +  x2y  (5.3) 

g(x,  y)  =  Bx  -  x2y.  (5.4) 

Then,  the  above  system  admits  the  trivial  stationary  solution  S  —  A,  y  =  B/A.  A  stable  periodic 
solution  to  the  system  exists  if  the  eigenvalues  of  largest  real  parts  of  the  Jacobian  of  the  right 
hand  side  of  (5.1),  (5.2)  is  exactly  zero.  For  the  purpose  of  verifying  this  fact  numerically,  one  first 
needs  to  discretize  the  equations  with  respect  to  the  variable  z  and  compute  the  eigenvalues  with 
largest  real  parts  of  the  resulting  discrete  Jacobian. 

For  this  example,  the  exact  eigenvalues  are  known  and  the  problem  is  analytically  solvable. 
The  article  [21]  considers  the  following  set  of  parameters 

Dx  =  0.008,  Dv  =  ^Dz  =  0.004, 

A  =  2,  B  =  5.45 

The  bifurcation  parameter  is  L.  For  small  L  the  Jacobian  has  only  eigenvalues  with  negative  real 
parts.  At  L  ss  0.51302  a  complex  eigenvalue  appears.  Our  tests  verify  this  fact. 

Let  us  discretize  the  interval  [0, 1]  using  n  +  1  points,  and  define  the  mesh  size  h  s  1/n.  The 
discrete  vector  is  of  the  form  where  x  and  y  are  n-dimensional  vectors.  Denoting  by  Jh  and 
gk  the  corresponding  discretized  functions  /  and  g,  the  Jacobian  is  a  2  x  2  block  matrix  in  which 
the  diagonal  blocks  (1, 1)  and  (2,2)  are  the  matrices 

1  Dx  7W,„„r,  o  i \  I  dfh(x,y) 
h2  p  Tndla9\1'  2’1)+  dx 


:  v'*," *7  ur-K"  9r  v-  ■,  mr  v *rr*r T^r  — tw  *-  ^ -.-■  •_  -j-  -  ■ -  v-  ^  v 
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eigenvalues  of  A  are  the  eigenvalues  of  Am  and  the  approximate  eigenvectors  are  given  by  Vmy\ 
where  yjm^  is  an  eigenvector  of  Am  associated  with  the  eigenvalue  A,.  A  sketch  of  the  algorithm  is 
as  follows. 


The  Preconditioned  Arnoldi-  Least  Squares  Eigenvalue  Algorithm 


(1)  Start: 

Choose  the  degree  k  of  the  polynomial  pt,  the  dimension  m  of  the  Arnoldi  subspaces  and 
an  initial  vector  v\. 

(2)  Initial  Arnoldi  Step: 

Using  the  initial  vector  t/i,  perform  m  steps  of  the  Arnoldi  method  with  the  matrix  A. 

(3)  Projection  Step: 

Obtain  the  matrix  Am  =  V^AVm  and  its  m  eigenvales  {Ai, . . .  Am}  and  eigenvectors  y,. 
Compute  the  approximate  eigenvectors  ut-  =  Vmy ,•  for  i  =  1,2  ...  r  and  their  residual  norms 
pi,i  =  1  ,r.  If  satisfied  then  Stop.  Else 

Adapt:  From  the  previous  convex  hull  and  the  set  {Ar+i,...,Am}  construct  a  new  convex 
hull  of  the  unwanted  eigenvalues. 

Obtain  the  new  least  squares  polynomial  pt  of  degree  k. 

Compute  a  linear  combination  zo  of  the  approximate  eigenvectors  u,-,t  =  l,r. 

(4)  Arnoldi  iteration: 

Perform  m  steps  of  Arnoldi’s  method  with  the  matrix  B*  =  p*(A)  starting  with  t>!  = 
*o/||zo||-  Goto  3. 


When  passing  from  step  2  to  step  3,  it  is  not  necessary  to  actually  compute  the  matrix  Am 
which  is  available  in  step  2  as  the  Arnoldi  matrix  Hm.  The  linear  combination  of  the  approximate 
eigenvectors  in  step  3  is  the  same  as  that  of  the  Hybrid  Arnoldi/Least  squares  method  of  Section 
3.3. 

The  difference  between  this  method  and  that  of  section  3.3  is  that  here  the  polynomial  iteration 
is  an  inner  iteration  and  Arnoldi  is  the  outer  loop,  while  in  the  hybrid  method,  the  two  processes 
are  serially  following  each  other.  Both  methods  can  be  viewed  as  means  of  accelerating  the  Arnoldi 
method. 

It  is  clear  that  the  Schur-Wielandt  deflation  technique  can  also  be  applied  to  the  polynomial 
preconditioned  Arnoldi  method  and  this  is  recommended.  However,  in  our  numerical  experiments, 
we  will  only  compare  the  approach  of  deflation  in  conjunction  with  polynomial  acceleration  in  the 
sense  of  the  previous  sections,  versus  that  of  polynomial  preconditioning  without  any  deflation. 

5.  Numerical  experiments 

All  numerical  tests  have  been  performed  on  a  Vax-785  using  double  precision,  i.e.,  the  unit 
roundoff  is  2-56  1.3877  x  10“ 17 . 

5.1.  The  test  example 

Our  test  example,  taken  from  [21],  models  concentration  waves  in  reaction  and  transport 
interaction  of  some  chemical  solutions  in  a  tubular  reactor.  The  concentrations  x(r ,  z) ,  y(r ,  z)  of 
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4.  A  Polynomial  Preconditioned  Arnoldi  Method 

There  are  various  ways  of  preconditioning  a  linear  linear  system  Ax  =  b  prior  to  solving  it  by  a 
Krylov  subspace  method.  Preconditioning  consists  in  transforming  the  original  linear  system  into 
one  which  requires  fewer  iterations  with  a  given  Krylov  subspace  method,  without  increasing  the 
cost  of  each  iteration  too  much.  For  eigenvalue  problems  similar  methods  have  not  been  given  much 
attention  although  the  shift  and  invert  technique  can  be  viewed  as  a  means  of  preconditioning.  If 
the  shift  a  is  suitably  chosen  the  shifted  and  inverted  matrix  B  =  (A  -  has  a  spectrum 

with  much  better  separation  properties  than  the  original  matrix  A  and  therefore  would  require  less 
iterations  to  converge.  Thus,  the  rationale  behind  shift  and  invert  technique  is  that  factoring  the 
matrix  (A  —  <rJ)  once,  or  a  few  times  during  a  whole  run  in  which  a  is  changed  a  few  times,  is 
a  price  worth  paying  because  the  number  of  iterations  required  with  B  is  so  much  less  than  that 
required  with  A  that  the  cost  of  factorization  is  payed  off.  Essentially  the  same  argument  is  used 
in  the  preconditioned  conjugate  gradient  method  when  dealing  with  linear  systems. 

There  are  instances  where  shift  and  invert  is  essential  and  should  not  be  avoided,  as  for  example 
for  the  generalized  eigenvalue  problems  Ku  =  AA/u.  The  reasons  are  discussed  at  length  in  [18], 
[17],  [23]  and  [31]  the  most  important  one  being  that  since  we  must  factor  one  of  the  matrices 
K  or  M  in  any  case,  there  is  little  incentive  in  not  factoring  ( K  —  aM)  instead,  to  gain  faster 
convergence. 

For  a  classical  eigenvalue  problem,  one  alternative  is  to  use  polynomial  preconditioning  as  is 
described  next.  The  idea  of  polynomial  preconditioning  is  to  replace  the  operator  B  by  a  simpler 
matrix  provided  by  a  polynomial  in  A.  Specifically,  we  consider  the  polynomial  in  A 

Bk  =  Pk(A)  (4.1) 

where  p*  is  a  degree  k  polynomial.  Ruhe  [22]  considers  a  more  general  method  in  which  pt  is  not 
restricted  to  be  a  polynomial  but  can  be  a  rational  function.  When  an  Arnoldi  type  method  is 
applied  to  Bk,  we  do  not  need  to  form  Bk  explicitly,  since  all  we  will  ever  need  in  order  to  multiply 
a  vector  x  by  the  matrix  Bk'vsk  matrix-vector  products  with  the  original  matrix  A  and  some  linear 
combinations. 

For  fast  convergence,  we  would  ideally  like  that  the  r  wanted  eigenvalues  of  largest  real  parts  of 
A  be  transformed  by  pt  into  r  eigenvalues  of  Bk  that  are  very  large  as  compared  with  the  remaining 
eigenvalues.  Thus,  we  can  proceed  as  in  section  3.3,  by  attempting  to  minimize  some  norm  of  pi 
in  some  region  subject  to  the  constraint  (3.6).  Once  again  we  have  freedom  in  choosing  the  norm 
of  the  polynomials,  to  be  either  the  infinity  norm  or  the  £2'norm.  Because  it  appears  that  the 
Xi2-norm  offers  more  flexibility  and  performs  usually  slightly  better  than  the  infinity  norm,  we  will 
only  consider  a  technique  based  on  the  least  squares  approach.  We  should  emphasize,  however, 
that  a  similar  technique  using  Chebyshev  polynomials  can  easily  be  developed.  Therefore,  we  are 
faced  again  with  the  function  approximation  problem  described  in  Section  3.3. 

Once  the  polynomial  p*  is  calculated  the  preconditioned  Arnoldi  process  consists  in  using 
Arnoldi’s  method  with  the  matrix  A  replaced  by  Bk  =  Pk{A).  This  will  provide  us  with  approxi¬ 
mations  to  the  eigenvalues  of  Bk  which  are  related  to  those  of  A  by  A,(Bi)  =  pjt(A,(A))  It  is  clear 
that  the  approximate  eigenvalues  of  A  can  be  obtained  from  the  computed  eigenvalues  of  Bk  by 
solving  a  polynomial  equation.  However,  the  process  is  complicated  by  the  fact  that  there  are  k 
roots  of  that  equation  for  each  value  A ,(B*)  that  are  candidates  for  representing  one  eigenvalue 
A, (A).  The  difficulty  is  by  no  means  unsurmontable  but  we  have  preferred  a  more  expensive  but 
simpler  alternative  based  on  the  fact  that  the  eigenvectors  of  A  and  Bk  are  identical.  At  the  end  of 
the  Arnoldi  process  we  obtain  an  orthonormal  basis  Vm  which  contains  all  the  approximations  to 
these  eigenvectors.  A  simple  idea  is  to  perform  a  Galerkin  process  for  A  onto  Span[Vm]  by  explicitly 
computing  the  matrix  Am  =  V £ AVm  and  its  eigenvalues  and  eigenvectors.  Then  the  approximate 


The  Hybrid  Least-Squares  Arnold!  Algorithm 


(1)  Start: 

Choose  the  degree  k  of  the  polynomial  pt,  the  dimension  m  of  the  Arnoldi  subspaces  and 
an  initial  vector  v\. 

(2)  Projection  step: 

Using  the  initial  vector  t/iz  perform  m  steps  of  the  Arnoldi  method  and  get  the  m  approx¬ 
imate  eigenvalues  {Ai, . . .  Am}  of  the  matrix  Hm. 

Estimate  the  residual  norms  i  =  l,r,  associated  with  the  r  eigenvalues  of  largest  real 
parts  {Ai,...Ar}  If  satisfied  then  Stop.  Else 

Adapt:  From  the  previous  convex  hull  and  the  set  {Ar+j, . . .  Ay}  construct  a  new  convex 
hull  of  the  unwanted  eigenvalues. 

Obtain  the  new  least  squares  polynomial  of  degree  k. 

Compute  a  linear  combination  zq  of  the  approximate  eigenvectors  u, ,  t  =  l,r. 

(3)  Polynomial  iteration: 

Compute  zt  —  Pk(A)zo.  Compute  v\  =  Zfc/||zfc||  and  goto  2. 


Many  practical  details  are  omitted  and  are  discussed  at  length  in  [25] .  We  only  mention  that 
the  linear  combination  at  the  end  of  step  3,  is  usually  taken  as  follows: 

r 

ZQ  -  ^2  Pi^i 

i=l 

in  which  the  vectors  u,  are  the  normalized  approximate  eigenvectors  and  p,-  are  their  residual  norms. 
The  effect  of  this  heuristic  choice  is  twofold.  First  it  avoids  complex  arithmetic  when  the  matrix  A  is 
real,  because  then  the  vector  zo  is  always  real.  Second,  it  avoids  the  damaging  effects  of  unbalanced 
convergence  by  putting  more  emphasis  on  the  eigenvectors  that  are  slower  to  converge.  In  fact  here 
lies  a  weaknesses  similar  to  that  of  the  restarted  Arnoldi  method  mentioned  in  section  3.1.  It  is 
difficult  to  choose  a  linear  combination  that  leads  to  balanced  convergence  because  it  is  difficult 
to  represent  a  whole  subspace  by  a  single  vector.  This  translates  into  divergence  in  many  cases 
especially  when  the  number  of  wanted  eigenvalues  r  is  not  small.  There  is  always  the  possibility 
of  increasing  the  space  dimension  m,  at  a  high  cost,  to  ensure  convergence  but  this  solution  is  not 
always  satisfactory  from  the  practical  point  of  view. 

Use  of  deflation  constitutes  a  good  remedy  against  this  difficulty  because  it  allows  us  to  compute 
one  eigenvalue  at  a  time  which  is  much  easier  than  computing  a  few  of  them  at  once.  Another 
solution  is  to  improve  the  separation  of  the  desired  eigenvalues  by  replacing  A  by  a  polynomial  in 
A.  This  approach,  referred  to  as  polynomial  preconditining  will  be  presented  in  the  next  section. 

One  attractive  feature  of  the  deflation  techniques  is  that  the  information  gathered  from  the 
determination  of  the  eigenvalue  A,  can  be  used  to  help  iterate  when  computing  the  eigenvalue  A,+i. 
The  simplest  way  in  which  this  is  achieved  is  by  using  at  least  part  of  the  convex  hull  determined 
during  the  computation  of  A,  .  Moreover,  a  rough  approximate  eigenvector  associated  with  A,+1  can 
be  inexpensively  determined  during  the  computation  of  the  eigenvalue  A,  and  then  used  as  initial 
vector  in  the  next  step  for  computing  A,+j. 
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in  which  the  =  1  ,...r  constitute  r  different  weights.  Since  it  is  known  that  the  maximum 

modulus  of  an  analytic  function  over  a  region  of  the  complex  plane  is  reached  on  the  boundary 
of  the  region,  one  solution  to  the  above  problem  is  to  minimize  an  L2-iiorm  associated  with  some 
weight  function  w,  over  all  polynomials  of  degree  k  satisfying  the  constraint  (3.6).  We  need  to 
choose  a  weight  function  u>  that  leads  to  easy  computations  in  practice. 

Let  the  region  H  of  the  complex  plane,  containing  the  unwanted  eigenvalues  Ar+i, . . .  A/v,  be  a 
polygon  consisting  of  p  edges  £1, £2, ..  .E^,  each  edge  Ej  linking  two  successive  vertices  hy_i  and 
hj  of  H.  Denoting  by  c,  =  \{hj  +  h;_i)  the  center  of  the  edge  Ej  and  by  dj  =  \(hj  -  hj-i)  its 
half-width,  we  define  the  following  Chebyshev  weight  function  on  each  edge: 


^(A)  =  |i^-(A-C>)2rl/2 


(3.7). 


The  weight  ui  on  the  boundary  dH  of  the  polygonal  region  is  defined  as  the  function  whose  restric¬ 
tion  to  each  edge  Ej  is  ojj.  Finally,  the  Z.2-inner-product  over  dH  is  defined  by 


<  P,q  >*=  f  p(A)<j(A)w(A)|dA| 
J  dH 


and  the  corresponding  L2-uorm  is  denoted  by  ||.||w. 

Theorem  3.1.  Let  {ir,},=o,jfc  be  the  first  k  +  1  orthonormal  polynomials  with  respect  to  the  L?-inner- 
product  (3.8).  Then  among  all  polynomials  p  of  degree  k  satisfying  the  constraint  (3.6),  the  one 
with  smallest  u-norm  is  given  by 


=  £  /  P(A)?(A)wj(A)|dA|  (3.8), 

jm  1  JE> 

Then  we  have  the  following  result  [25]. 


PkW  = 


ELol^.l2 


(3.9) 


where  =  ^=1  Pi»i(A>)- 

On  the  practical  side  the  process  of  constructing  the  orthogonal  polynomials  can  be  difficult 
and  unstable  if  not  enough  care  is  exerted  in  the  computation.  In  [25]  and  [24],  the  orthogonal 
polynomials  as  well  as  their  linear  combination  (3.9)  are  all  expressed  in  terms  of  a  Chebyshev  basis 
associated  with  the  ellipse  of  smallest  area  containing  H.  Then,  the  moment  matrix  A/*  associated 
with  this  basis  is  constructed  without  any  numerical  integration.  The  solution  (3.9)  is  then  obtained 
by  solving  a  linear  system  with  this  Gram  matrix.  The  tedious  details  on  the  practical  computation 
may  be  found  in  [25] . 

The  resulting  hybrid  method  for  computing  the  r  eigenvalues  with  largest  real  parts  is  outlined 
next. 
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Arnoldi’s  method.  We  can  then  reproduce  the  previous  steps  with  this  vector  as  iq.  This  process 
is  repeated  until  the  whole  set  of  desired  eigenvalues  has  converged. 

This  combination  constitutes  an  extremely  ~owerful  technique  if  only  the  eigenvalue  of  largest 
real  part  is  to  be  computed.  When  several  of  them  must  be  computed  then  the  restarting  procedure 
may  lead  to  some  difficulties.  See  [26]  for  details  on  this  process  and  numerical  experiments.  Again 
reliability  will  be  improved  by  simply  computing  one  eigenpair  at  a  time  and  using  the  Schur- 
Wielandt  deflation. 

3.3.  The  Least  Squares  -  Arnoldi  method  with  Deflation 

The  choice  of  ellipses  as  enclosing  regions  in  Chebyshev  acceleration  may  be  overly  restrictive 
and  ineffective  if  the  shape  of  the  convex  hull  of  the  unwanted  eigenvalues  bears  little  resemblance 
with  an  ellipse.  This  was  the  motivation  of  the  work  [25]  in  which  the  acceleration  polynomial  is 
chosen  so  as  to  minimize  an  i^-norm  of  the  polynomial  p  on  the  boundary  of  the  convex  hull  of 
the  unwanted  eigenvalues  with  respect  to  some  suitable  weight  function  u.  The  only  restriction 
with  this  technique  is  that  the  degree  of  the  polynomial  is  limited  because  of  cost  and  storage 
requirement.  This,  however,  is  overcome  by  compounding  low  degree  polynomials.  The  stability  of 
the  computation  is  enhanced  by  employing  a  Chebyshev  basis  and  by  a  careful  implementation  in 
which  the  degree  of  the  polynomial  is  taken  to  be  the  largest  one  for  which  the  Gram  matrix  has  a 
tolerable  conditioning.  The  method  for  computing  the  least  squares  polynomial  is  fully  described 
in  [25]  but  we  present  a  summary  of  its  main  features  below. 

Suppose  that  we  are  interested  in  computing  the  r  eigenvalues  of  largest  real  parts  Ai,  A2, . . .  Ar 
and  consider  the  vector 

Zk  =  Pk(A)zo  (3.4) 

where  pt  is  a  degree  k  polynomial.  If  A  is  diagonalizable,  then  by  writing  the  expansion  of  zq  in 
the  basis  of  eigenvectors  u,  of  A  as 

N 

zo  =  ^2 
1=1 

we  get  Zk  =  YiiLi  ^.Pfc(A,  )u,  which  we  separate  in  two  parts 

r  N 

zk  ~  £<Pfc(A,)U|-  4-  y  ]  £iPk(^i)ut  (3.5) 

1=1  i=r+l 

The  principle  of  the  hybrid  least  squares- Arnoldi  method  is  to  use  the  vector  Zk  as  an  initial 
vector.  From  the  analysis  of  the  Arnoldi  process,  it  is  clear  that  we  want  the  second  part  of  the 
above  expansion  to  be  small  compared  with  the  first  part.  In  fact  it  can  be  proved  [26]  that 
if  the  second  part  is  zero  then  the  Arnoldi  process  will  stop  at  the  rth  step  with  Kr  becoming 
the  invariant  subspace  associated  with  the  eigenvalues  Ai,A2,...Ar.  Therefore,  we  wish  to  choose 
among  all  polynomials  p  of  degree  <  k  one  for  which  p(A,-),i>r  are  small  relative  to  p(A,),  i  <  r. 

Assume  that  by  some  adaptive  process,  a  polygonal  region  H  which  encloses  the  remaining 
eigenvalues  becomes  available  to  us.  We  then  arrive  at  a  function  approximation  problem,  which 
roughly  formulated  consists  of  finding  a  polynomial  of  degree  k  whose  value  inside  some  region  is 
small  while  its  values  at  r  particular  points  (possibily  complex)  are  large.  For  a  more  precise  formu¬ 
lation  we  begin  by  normalizing  the  polynomial  at  the  points  Ai,A2,...Ar.  One  such  normalization 
is 

r 

X>;P(Ay)  =  1  (3.6) 

>-1 
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A  major  limitation  of  Arnoldi’s  method  is  that  its  cost  and  storage  requirements  increases 
drastically  as  the  number  of  steps  m  required  for  convergence  increases.  An  immediate  remedy  for 
this  is  to  use  restarting:  after  the  m  steps  are  performed  one  restarts  with  an  initial  vector  formed 
from  a  linear  combination  of  the  eigenvectors  (3.3)  associated  with  the  desired  eigenvalues.  This 
was  proposed  as  a  simple  alternative  to  the  classical  method  in  [27]  and  was  further  improved  by 
the  incorporation  of  polynomial-based  acceleration  techniques  in  [26]  and  [25]. 

However,  the  restarting  method  may  encounter  some  difficulties  especially  in  cases  when  the 
number  of  wanted  eigenvalues  is  not  small.  One  way  in  which  this  inefficiency  manifests  itself  is 
that  when  restarting,  the  process  is  often  unable  to  keep  the  accuracy  gained  in  the  previous  steps 
for  all  eigenvalues,  i.e.,  the  accuracy  may  improve  in  some  eigenvalues  but  deteriorates  in  some 
others.  It  is  difficult  when  the  number  of  eigenvalues  is  not  small  to  make  the  method  produce  a 
similar  accuracy  for  all  the  wanted  eigenvalues.  This  is  why  deflation  is  so  important.  Very  often  it 
is  possible  to  recover  convergence  by  using  a  larger  number  of  steps  in  the  iterative  Arnoldi  method, 
but  this  is  not  always  desirable  as  the  storage  requirement  increases  drastically. 

Since  Arnoldi’s  method  is  relatively  successful  in  computing  the  eigenvalue  of  largest  real  part 
of  a  large  nonsymmetric  matrix,  we  can  improve  the  reliability  of  the  method  by  always  computing 
one  eigenvalue  and  its  eigenvector  at  a  time,  i.e.  by  never  attempting  to  extract  more  than  one 
eigenpair  at  a  time.  This  can  be  achieved  by  using  the  deflation  algorithm  PSWD,  in  which 
algorithm  A  is  simply  replaced  by  the  restarted  Arnoldi. 

3.2.  The  Chebyshev  -  Arnoldi  method  with  deflation 

One  way  of  avoiding  the  weaknesses  of  Arnoldi’s  method  is  to  use  a  good  initial  vector,  i.e. 
an  initial  vector  for  the  total  number  of  Arnoldi  iteration  is  small.  This  can  be  achieved  by 
preprocessing  the  initial  vector  by  a  polynomial  type  iteration  before  feeding  it  into  the  Arnoldi 
algorithm.  The  question  of  course,  is  how  to  select  a  good  polynomial.  The  basic  principle  of 
polynomial  acceleration  techniques  is  to  start  by  enclosing  the  set  of  unwanted  eigenvalues  in  some 
domain  and  then  find  a  polynomial  which  has  a  small  modulus  in  that  domain  comparatively  to  its 
modulus  on  the  wanted  eigenvalues.  In  Chebyshev  acceleration  the  enclosing  domain  is  an  ellipse 
and  the  basic  idea  is  to  minimize  the  maximum  modulus  of  a  polynomial  p  over  that  ellipse  subjset 
to  the  constraint  that  p(At)  =  1,  where  A*  is  the  eigenvalue  of  largest  real  part,  assumed  to  be  real 
here  for  simplicity.  This  approach  which  leads  naturally  to  Chebyshev  polynomials,  was  considered 
by  Manteuffel  who  uses  it  as  the  basis  for  an  iterative  method  for  solving  nonsymmetric  linear 
systems  [15,  16]. 

In  [26]  we  have  described  a  hybrid  method  based  on  a  combination  of  Chebyshev  iteration 
and  Arnoldi’s  method  for  computing  eigenvalues  and  eigenvectors  of  nonsymmetric  matrices.  The 
algorithm  described  in  [26]  attempts  to  compute  simultaneously  the  set  of  p  eigenvalues  of  largest 
(or  smallest)  real  parts.  Schematically,  the  algorithm  runs  as  follows.  Initially,  we  perform  m  steps, 
of  Arnoldi’s  method  where  m>p  is  fixed,  starting  with  a  random  initial  vector.  This  computes  a  set 
of  m  approximate  eigenvalues  which  are  split  in  two  parts:  the  set  of  wanted  eigenvalues  (i.e.  the 
p  approximate  eigenvalues  with  largest  real  parts)  and  that  of  the  unwanted  ones  (the  remaining 
approximate  eigenvalues).  From  the  set  of  unwanted  eigenvalues,  one  builds  a  polygonal  convex 
hull  that  contains  all  the  unwanted  eigenvalues.  Then  the  parameters  of  an  ellipse  containing 
that  convex  hull  are  computed.  The  parameters  of  the  computed  ellipse  are  optimal  in  a  certain 
sense.  Using  these  parameters,  a  certain  number  of  steps  of  Chebyshev  iteration  are  performed 
starting  with  a  certain  linear  combination  of  the  approximate  Arnoldi  eigenvectors  associated  with 
the  wanted  eigenvalues,  as  an  initial  vector.  The  effect  of  the  Chebyshev  iteration,  is  to  damp 
the  eigencomponents  associated  with  the  eigenvalues  inside  the  ellipse  contaning  the  convex  hull 
of  unwanted  eigenvalues  while  highly  amplifying  those  components  associated  with  the  wanted 
eigenvalues.  As  a  result  the  final  vector  of  this  Chebyshev  iteration  is  a  perfect  candidate  for 
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